[Target] PDFs from website [Method] Automation [UI/UX] N/A [Stack] Python (requests, BeautifulSoup), Selenium, Pandas [Security] N/A [Format] JSON

freelancer.com 🟠 2026-05-07

👤 Client: 🇮🇳 DELHI, India Member since 2021-05-15
💰 Price: $90 (average bid)
🚩 Problem: Automate the download of over 7,000 PDF documents and extract specific text fields into an Excel template.
📦 Existing: Not specified

Specifications:

[Target] Download more than 7,000 PDFs from a public website.
[Method] Use Python with requests and BeautifulSoup for scraping, and Selenium for dynamically rendered content. Run the whole process from scripts.
[Stack] Python (requests, BeautifulSoup, Pandas), Selenium
[Security] No authentication is required; the site is public and not password-protected.
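
A minimal sketch of the link-collection step, assuming the PDFs are reachable through ordinary `<a href>` links on listing pages. It uses the standard library's `html.parser` in place of BeautifulSoup so that it runs without dependencies; with BeautifulSoup the same filter could be applied to `soup.find_all("a", href=True)`. The names `PdfLinkParser`, `find_pdf_links`, `local_path_for`, the `downloads` root, and the example URLs are all hypothetical.

```python
from html.parser import HTMLParser
from pathlib import PurePosixPath
from urllib.parse import urljoin, urlparse


class PdfLinkParser(HTMLParser):
    """Collect absolute URLs of <a> links whose path ends in .pdf."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        href = dict(attrs).get("href")
        if not href:
            return
        # Resolve relative links against the page URL.
        absolute = urljoin(self.base_url, href)
        is_pdf = urlparse(absolute).path.lower().endswith(".pdf")
        if is_pdf and absolute not in self.links:
            self.links.append(absolute)


def find_pdf_links(html, base_url):
    """Return the de-duplicated PDF links found in one HTML page."""
    parser = PdfLinkParser(base_url)
    parser.feed(html)
    return parser.links


def local_path_for(url, root="downloads"):
    """Mirror the URL path under `root` so the folder structure is kept."""
    parts = [p for p in PurePosixPath(urlparse(url).path).parts
             if p not in ("/", "..")]
    return str(PurePosixPath(root, *parts))


# Hypothetical listing page used only to illustrate the functions.
sample_html = (
    '<a href="/docs/2024/a.pdf">A</a>'
    '<a href="about.html">About</a>'
    '<a href="b.PDF?v=1">B</a>'
)
sample_links = find_pdf_links(sample_html, "https://example.com/list/")
```

Each URL in `sample_links` could then be fetched with `requests.get(url, timeout=...)` and written to `local_path_for(url)`. Pages that build their links with JavaScript would need Selenium to render the HTML first.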

Workflow:

1. Set up a reliable VPN connection for consistent access.
2. Develop a script to navigate through the website and download PDFs into folders and subfolders.
3. Extract the required text fields from each PDF, using OCR where a PDF has no text layer, or parse the same fields from the HTML page when they are available there.
4. Populate the provided Excel template with the extracted data, ensuring accuracy and consistency.
5. Validate all entries and confirm they follow the template's column order and validation rules.
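
Steps 3 to 5 can be sketched as follows, assuming the text of each PDF has already been obtained (from a text layer or OCR, not shown here). The posting does not name the fields or the template columns, so `COLUMNS`, `PATTERNS`, and the sample text are hypothetical placeholders.

```python
import re

# Hypothetical column order; the real one comes from the client's Excel template.
COLUMNS = ["document_id", "title", "issue_date"]

# Hypothetical patterns; the posting does not say which fields are required.
PATTERNS = {
    "document_id": re.compile(r"Document No\.?\s*:\s*(\S+)"),
    "title": re.compile(r"Title\s*:\s*(.+)"),
    "issue_date": re.compile(r"Date\s*:\s*(\d{4}-\d{2}-\d{2})"),
}


def extract_row(text):
    """Build one row in template column order; missing fields become ''."""
    row = {}
    for column in COLUMNS:
        match = PATTERNS[column].search(text)
        row[column] = match.group(1).strip() if match else ""
    return row


def validate_row(row):
    """Return a list of problems; an empty list means the row passed."""
    errors = []
    if list(row) != COLUMNS:
        errors.append("columns out of order")
    errors.extend(f"missing {c}" for c in COLUMNS if not row.get(c))
    return errors


# Hypothetical extracted text used only to illustrate the functions.
sample_text = "Document No: AB-123\nTitle: Annual Report \nDate: 2024-03-01\n"
sample_row = extract_row(sample_text)
sample_errors = validate_row(sample_row)
```

Rows that pass validation could be collected into a list and written out with pandas, for example `pandas.DataFrame(rows, columns=COLUMNS).to_excel(path, index=False)`. Rows with errors are better logged for manual review than written with blank cells.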
